The system that we have developed with ARO support during the past three years convincingly
demonstrates  the  technology  required  to 

•  manage  complex  user  requests  within  the  context  of  a  large  tool  suite, 

•  allow  users  to  obtain  instruction  about  how  to  make  complex  requests  from  the  system  itself,  and 

•  easily  develop  processors  for  problem-oriented  specification  languages. 

We  have  used  the  system  routinely  in  its  own  development  and  in  a  variety  of  applications  from  circuit 
board  layout  through  robot  control  to  database  query  specification.  It  is  now  available  for  distribution  to 
sites  with  Sun3  and  Sun4  computers. 

This  report  summarizes  the  most  important  results  of  our  research,  and  places  them  in  the  context  of 
the  general  problem  of  transitioning  the  technology  represented  by  large  software  systems.  The  work 
described  here  relates  directly  to  the  growing  problem  of  moving  results  out  of  the  laboratory  into  day-to- 
day  life.  We  show  how  the  inevitable  complexity  of  a  modular  collection  of  processes  can  be  hidden  from 
the  user  by  an  expert  system  that  understands  their  low-level  relationships.  That  expert  system  can  observe 
the  behavior  of  the  hidden  processes  and  interpret  their  actions  in  terms  the  user  understands.  It  can  also 
call  on  extensive  tutorial  material,  including  exercises  for  the  user,  to  aid  the  user  in  completing  a  task.  In 
order to demonstrate these principles, we have used them to construct a system that solves a real-life problem:
construction of processors for problem-oriented specification languages.


1.  The  Problem  —  Using  Computers  More  Effectively 

Any  computer,  from  a  small  PC  to  a  large  mainframe,  provides  an  environment  within  which  a  user 
makes  requests.  The  requests  may  simply  extract  information  from  the  environment  (e.g.  a  request  to 
obtain  the  size  of  a  file)  or  they  may  alter  the  environment  (e.g.  a  request  to  compile  a  program  and  store 
the object file). Some requests are simple to implement and others are more complex, but each appears to
the user as an atomic operation that carries out some useful task. One way to use the computer more effectively
is to "package" commonly used combinations of requests as single requests. Most operating systems
provide command script mechanisms for this purpose, and special tools like the Unix make [1] may also
be available. Unfortunately, neither of these approaches is entirely satisfactory. Command scripts are too
rigid, and make-like tools provide neither sufficient parameterization nor sufficient hiding of intermediate
products.


Many  requests  made  of  computers  require  large-scale  parameterization:  The  class  of  problems  to  be 
solved  by  the  request  is  well-understood,  but  there  are  a  large  number  of  problem  instances  that  must  be 
solved  in  slightly  different  ways.  A  typical  example  of  such  a  situation  is  a  request  to  sort  some  data  file. 
The general sorting problem is well-understood, but a particular solution depends upon many things (collating
sequence, primary and secondary fields, characters to be ignored, etc.). The general approach to such
problems  is  to  create  processor  generators  that  accept  a  description  of  the  problem  instance  and  create  a 
processor  to  handle  requests  for  its  solution.  Problem  descriptions  are  written  in  declarative  notations  that 
have  often  been  called  fourth-generation  languages. 
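
The sorting example can be sketched in Python as a tiny processor generator. This is only an illustration under stated assumptions: the dictionary-based description format and its keys (`fields`, `ignore`, `reverse`) are hypothetical and are not taken from any system described in this report.

```python
from typing import Callable, Dict, List, Sequence

Record = Dict[str, str]


def make_sorter(spec: Dict) -> Callable[[Sequence[Record]], List[Record]]:
    """Generate a sorting processor from a declarative description.

    Hypothetical description keys:
      fields  - field names in priority order (primary field first)
      ignore  - characters to drop before comparing
      reverse - True for descending order
    """
    fields = list(spec["fields"])
    reverse = bool(spec.get("reverse", False))
    # Build the character-deletion table once, at generation time.
    drop = str.maketrans("", "", spec.get("ignore", ""))

    def key(record: Record):
        # Case-insensitive comparison stands in for a collating sequence.
        return tuple(record[f].translate(drop).lower() for f in fields)

    def sorter(records: Sequence[Record]) -> List[Record]:
        return sorted(records, key=key, reverse=reverse)

    return sorter


# The problem instance is described, not programmed.
description = {"fields": ["last", "first"], "ignore": "-' "}
sort_names = make_sorter(description)

people = [
    {"first": "Ann", "last": "O'Neil"},
    {"first": "Bob", "last": "Oneal"},
    {"first": "Al", "last": "O'Neil"},
]
result = sort_names(people)
print([p["first"] for p in result])
```

The person who writes `description` states what ordering is wanted; the generator decides how the comparison is carried out.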


Creation  of  a  processor  generator  is  basically  a  compiler  construction  task.  The  fourth-generation 
problem-description  language  must  be  designed,  and  a  program  built  to  analyze  descriptions  in  this 
language and produce processors that solve the problems described. Requiring that all processor generators
be built by people trained in compiler construction would, however, seriously limit the availability of
this  problem  solving  technique.  It  is  much  better  to  provide  people  who  need  processor  generators  with  the 
ability  to  construct  them  directly. 

The goal of our research project was a system that would allow a person needing a processor generator
to write declarative specifications describing the desired fourth-generation language and its translation,
and then request construction of the specified processor generator. In order to produce such a system, we
had to manage the request to produce the processor generator from its specification (which is very complex,
but must appear atomic to the user). In order to make the system usable to people other than its
designers, we had to provide easy ways for users to learn to use it and to obtain aid in interpreting diagnostic
output.

2. Management of Complex User Requests

Our  first  problem  was  how  to  manage  a  request  for  processor  generation.  We  knew  that  such  a 
request  would  involve  activation  of  a  number  of  separate  tools  to  create  processor  components  and  to 
integrate those components into a functioning whole. Many of the individual tools would be obtained from
other sources, because the cost of redeveloping them in-house would be prohibitive. That meant that whatever
management mechanism we chose could not require specific tool interfaces. Since the users of the
system would not be experts in compiler construction, it was important to hide both the tools and the intermediate
products. Although we needed to allow the user to create simple processors simply, facilities also
needed to be available for creation of complex processors that did not precisely fit a standard model. Thus
the creation request needed to be parameterizable. Finally, the management tool needed to be flexible so
that we could add new tools and change existing tools on the basis of our experience during the project.

We chose Odin [2,3], an expert system whose domain of expertise is management of complex user
requests, as our management tool [4]. Odin, like any expert system, separates the inference engine that
manages the user's request from the knowledge base that specifies the management policies. The manufacture
of a processor is described by a derivation graph, a declarative specification from which Odin's
knowledge base can be generated. Because the derivation graph is a declarative specification rather than a
program, it is easy to change and can be checked for consistency. This gives us the flexibility required to
add and alter tools. Odin's inference engine is completely independent of the processor-generation problem,
and could be used to manage any collection of software tools.

Nodes in the derivation graph represent either tools or objects. Tools are actually implemented by
operating system command scripts. Odin allows the node to choose the command processor it will use, the
directory in which it will run, and the commands it will execute. Thus the actions taken by a node can be
tailored to the tool it invokes, permitting us to employ arbitrary tools for which only object code is available,
or which require input or output format changes. This allows use of off-the-shelf tools and avoids
having to do extensive tool development in-house. Again, the behavior is independent of the processor-generation
problem; any collection of arbitrary programs could be managed.

The derivation graph is normally invisible to the user, although it is possible to get explanations of
what Odin is doing in terms of the derivation graph. (A hallmark of an expert system is that it can explain
the reasoning by which it arrives at a requested result.) This means that we can increase the complexity of
the manufacturing process without increasing the cognitive load on the user. Intermediate products used
during the manufacture are also invisible. They are kept in a separate directory called the cache. Wherever
possible, Odin will use existing copies of intermediate objects in satisfying requests, thus reducing
manufacturing costs.
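
The interplay of derivation graph, hidden tools, and cache can be illustrated with a small Python sketch. It is not Odin's specification language or implementation: the tool table, the object kinds (`spec`, `tokens`, `upper`, `report`), and the in-memory dictionary standing in for the cache directory are all assumptions made for the example.

```python
from collections import deque
from typing import Callable, Dict, List, Tuple

# Each tool derives one kind of object from another:
# (input kind, output kind, action). All names here are hypothetical.
Tool = Tuple[str, str, Callable[[str], str]]

TOOLS: List[Tool] = [
    ("spec", "tokens", lambda text: " ".join(text.split())),
    ("tokens", "upper", lambda text: text.upper()),
    ("upper", "report", lambda text: f"<{text}>"),
]

cache: Dict[Tuple[str, str], str] = {}  # (output kind, input content) -> product
runs: List[str] = []                    # records which tools actually ran


def find_chain(source_kind: str, target_kind: str) -> List[Tool]:
    """Breadth-first search for a sequence of tools from source to target."""
    queue = deque([(source_kind, [])])
    seen = {source_kind}
    while queue:
        kind, chain = queue.popleft()
        if kind == target_kind:
            return chain
        for tool in TOOLS:
            if tool[0] == kind and tool[1] not in seen:
                seen.add(tool[1])
                queue.append((tool[1], chain + [tool]))
    raise LookupError(f"no derivation from {source_kind} to {target_kind}")


def request(content: str, source_kind: str, target_kind: str) -> str:
    """Satisfy one user request, reusing cached intermediate products."""
    for _, out_kind, action in find_chain(source_kind, target_kind):
        key = (out_kind, content)
        if key not in cache:
            runs.append(out_kind)
            cache[key] = action(content)
        content = cache[key]
    return content


first = request("a   b", "spec", "report")
second = request("a   b", "spec", "upper")  # served entirely from the cache
print(first, second, runs)
```

The user names only the source and the desired product. Adding a tool means adding a row to the table, and the second request runs no tool at all because every intermediate object it needs already exists.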

During  the  grant  period,  we  have  worked  closely  with  the  developer  of  Odin  in  devising  techniques 
for  derivation  graph  specification  that  increase  flexibility  and  simplify  processing.  We  now  have  standard 
approaches to the introduction of new tools, and very powerful mechanisms for varying the manufacturing
steps on the basis of the particular set of specifications supplied by the user. This latter improvement has
drastically  reduced  the  need  for  user  parameters  to  control  manufacture. 

3.  Error  Reporting  and  Documentation 

Error reporting is a critical problem when dealing with complex user requests. The situation is
analogous to that found in a complex program when a low-level component detects an error, but does not
have sufficient context to report that error in terms understandable to the user. The technique of "unhurried
diagnostics" was developed to deal with this problem [6]. A program having information about a failure
may  take  any  one  of  four  actions: 

1)  Output  the  information  and  terminate. 

2)  Output  the  information  and  proceed  by  an  alternate  route. 

3)  Suppress  the  information  and  proceed  by  an  alternate  route. 

4)  If  not  the  main  program,  pass  the  information  to  its  caller  and  indicate  failure. 

The  original  version  of  Odin  allowed  each  of  the  first  three  actions  to  be  taken  by  individual  nodes;  the 
fourth  was  added  during  the  grant  period. 
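
The four actions listed above can be sketched in Python as follows. This is a hedged illustration of the general idea, not Odin's mechanism: the names `Action`, `StepFailed`, and `run_step`, and the sample messages, are invented for the example.

```python
from enum import Enum, auto
from typing import Callable, List


class Action(Enum):
    REPORT_AND_STOP = auto()        # 1) output the information and terminate
    REPORT_AND_CONTINUE = auto()    # 2) output it, proceed by an alternate route
    SUPPRESS_AND_CONTINUE = auto()  # 3) suppress it, proceed by an alternate route
    PASS_TO_CALLER = auto()         # 4) hand it to the caller and indicate failure


class StepFailed(Exception):
    """Carries failure information up to a caller with more context."""


def run_step(step: Callable[[], str], fallback: Callable[[], str],
             action: Action, log: List[str]) -> str:
    """Run a step and handle a failure according to the chosen action."""
    try:
        return step()
    except ValueError as err:
        info = str(err)
        if action is Action.PASS_TO_CALLER:
            raise StepFailed(info) from err
        if action is not Action.SUPPRESS_AND_CONTINUE:
            log.append(info)
        if action is Action.REPORT_AND_STOP:
            raise SystemExit(1)
        return fallback()


def parse() -> str:
    # A low-level component that knows what failed but not why it matters.
    raise ValueError("unexpected symbol at offset 7")


log: List[str] = []
quiet = run_step(parse, lambda: "default", Action.SUPPRESS_AND_CONTINUE, log)
noisy = run_step(parse, lambda: "default", Action.REPORT_AND_CONTINUE, log)
try:
    run_step(parse, lambda: "default", Action.PASS_TO_CALLER, log)
except StepFailed as failure:
    # The caller restates the failure in terms the user understands.
    log.append(f"line 3 of your grammar: {failure}")
print(log)
```

The fourth action is the one that makes the diagnostic "unhurried": the report is delayed until it reaches a component that can phrase it in the user's terms.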


The  important  point  is  that  the  derivation  graph  can  specify  arbitrary  manufacturing  steps  to  be 
applied to error reports. Those manufacturing steps can access arbitrary information about both the derivation
graph and the current contents of the cache. Thus it is possible to apply a complex interpretation process
to reports provided by any tool, without altering the tool in any way. A user of the system is unaware
of  this  interpretation  process,  and  need  know  nothing  about  the  tool  issuing  the  report. 

We  have  increased  the  amount  of  information  available  for  error  reporting  by  placing  the  system 
documentation  on  line  in  hypertext  form.  Not  only  does  this  allow  the  user  to  browse  the  documentation  in 
order to answer questions arising while creating specifications, but it permits the error analysis to access the
documentation  needed  to  explain  a  failure.  There  is  no  need  for  the  user  to  have  the  printed  form  of  the 
system  documentation  at  hand  because  a  simple  request  will  place  them  in  hypertext  browsing  mode  at  the 
most  likely  explanation  of  their  problem.  If  the  system  is  wrong,  the  user  is  in  a  position  to  browse  the 
entire  set  of  documentation  if  necessary. 

Both the hypertext form and the printed form of the system documentation are generated from a single
body of text. Thus we avoid any possible inconsistencies and reduce the cost of producing the hypertext
to zero.

We  have  made  a  modification  to  the  hypertext  browser  that  allows  a  reader  to  modify  and  execute 
examples  given  in  the  documentation.  This  feature  is  the  basis  of  a  system  tutorial  for  self-paced  learning. 
A  set  of  graded  examples  with  accompanying  exercises  introduces  the  new  user  to  the  individual 
specification  methods  supported  by  the  system.  Our  experience  has  been  that  this  approach  to  teaching 
people  how  to  use  the  system  is  much  more  effective  than  either  a  printed  manual  alone  or  a  sequence  of 
lectures. 

4.  Processor  Generation 

We began this research project with the usual complement of compiler construction tools: a lexical
analyzer generator [7], a parser generator [8], and an attribute grammar processor [9]. After building a derivation
graph that managed these tools and a collection of scripts to make them work together, we had a rudimentary
system for generating language processors. There were two basic deficiencies in this system that we
have attacked during the grant period: the weakness of the specification techniques and the poor performance
of the generated processors.

If a language processor is specified by a set of regular expressions (input to the lexical analyzer generator),
a context-free grammar (input to the parser generator), and an attribute grammar, then most of the
processor’s  behavior  must  be  specified  algorithmically  within  the  attribute  grammar.  Our  research  has 
identified  several  additional  subproblems  that  can  be  specified  declaratively  instead  of  algorithmically.  We 
have  used  the  system  to  generate  processors  for  fourth-generation  languages  in  which  these  declarative 
specifications  can  be  written.  By  extending  the  derivation  graph,  we  incorporated  these  processors  into  the 
system.  A  user  does  not  need  to  know  of  the  existence  of  the  additional  processors.  If  specifications  in  the 
languages  they  implement  are  provided  to  the  system,  they  are  invoked;  otherwise  they  are  not. 
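
A minimal Python sketch of this behavior follows, assuming that each kind of specification is recognized by its file type. The file suffixes and processor names in the table are hypothetical and chosen only for illustration.

```python
from pathlib import PurePath
from typing import Dict, Iterable, List

# Hypothetical file types and processor names, for illustration only.
PROCESSORS: Dict[str, str] = {
    ".lex": "lexical analyzer generator",
    ".grammar": "parser generator",
    ".attr": "attribute grammar processor",
    ".names": "name analysis generator",
}


def plan(spec_files: Iterable[str]) -> List[str]:
    """Return the processors to invoke, one per specification type supplied."""
    supplied = {PurePath(name).suffix for name in spec_files}
    unknown = supplied - set(PROCESSORS)
    if unknown:
        raise ValueError(f"no processor for file types: {sorted(unknown)}")
    # Follow the table's order so that the plan is deterministic.
    return [tool for suffix, tool in PROCESSORS.items() if suffix in supplied]


steps = plan(["calc.lex", "calc.grammar", "calc.attr"])
print(steps)
```

A user who supplies no `.names` file never learns that the corresponding processor exists; supplying one is all it takes to bring it into play.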

A  parallel  effort  involves  the  individual  tools  themselves.  In  response  to  experience  with  our  lexical 
analyzer  generator  we  redefined  the  specification  language  to  increase  its  flexibility  and  to  allow  for  library 
specifications.  We  replaced  the  parser  generator  with  a  new  version  from  Germany,  increasing  the  speed 
of  the  generated  processor  by  a  factor  of  five  without  altering  the  specification  language.  The  system  now 
supports  two  attribute  grammar  analyzers,  selecting  one  or  the  other  on  the  basis  of  the  type  of  specification 
file provided by the user. Extensive measurements of generated processors and comparisons with hand-coded
versions have led to modifications of the system and better results.

One  of  the  more  difficult  tasks  when  specifying  a  language  processor  seems  to  be  the  design  of  the 
context-free  grammar  that  describes  program  structure.  In  order  to  construct  an  efficient  parser  for  the 
language, the grammar must satisfy certain constraints. When these constraints are not satisfied, it is often
difficult  for  the  designer  to  determine  how  the  grammar  should  be  changed.  The  constraint  violation  is 
sometimes  the  result  of  a  subtle  ambiguity  in  the  language  definition,  and  sometimes  simply  due  to  a 
blunder  in  writing  the  grammar.  We  have  developed  a  processor  that  is  capable  of  pinpointing  the  most 
common  ambiguities  and  fixing  up  the  blunders.  This  processor  dramatically  reduces  the  difficulty  of 
specifying  grammars. 


5.  Conclusion 

Our  original  proposal  was  to  apply  Artificial  Intelligence  techniques  to  the  problem  of  automatic 
compiler  construction.  We  said  that  the  work  would  result  in  an  integrated  development  environment  that 
provides: 

•  greatly  reduced  compiler  development  time 

•  compilers  as  fast  and  compact  as  those  constructed  by  hand 

•  guaranteed  compiler  reliability 

•  a  reduction  of  human  expertise  required  in  compiler  construction 

•  a  simple  path  for  compiler  maintainability 

•  greatly  reduced  life-cycle  cost 

These objectives have been met. We are currently distributing the system, called Eli, for Sun3 and Sun4
equipment. Our experience with a wide variety of language processors indicates that development time is
about  1/3  of  that  required  using  other  approaches.  Our  measurements  of  a  generated  C  compiler  indicate 
that  it  has  approximately  the  same  performance  as  gcc,  a  well-regarded  compiler  that  was  coded  by  hand. 
Compilers  generated  by  Eli  do  not  crash.  They  always  behave  exactly  according  to  the  specifications  from 
which  they  were  generated.  We  have  used  Eli  in  a  project  class  at  the  University  of  Colorado  for  three 
years,  with  students  who  have  no  prior  compiler  construction  experience.  They  are  able  to  create  relatively 
sophisticated  language  processors  without  becoming  compiler  construction  experts.  Since  the  processors 
are generated from specifications, compiler maintenance is reduced to maintenance of the specifications.
Many  parts  of  these  specifications  are  re-usable,  thus  reducing  the  total  life-cycle  cost  of  each  compiler. 

In addition to providing the specific development environment for language processors, we have
demonstrated  the  technology  needed  to  construct  such  environments  in  general.  This  technology  can  be 
applied  to  any  situation  in  which  a  number  of  off-the-shelf  software  components  must  be  combined  to 
satisfy a single complex request. Our approach can be used to create a flexible system that provides error
reports at an appropriate level, offers extensive documentation and help facilities, and is capable of tutoring its
users.